Smart Paper Notes

Physics-based Deep Learning

Nils Thuerey, Philipp Holl, Maximilian Mueller, Patrick Schnell, Felix Trost, Kiwon Um

Category: Machine Learning

2021-09-11

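This digital book offers a practical and comprehensive introduction to everything related to deep learning in the context of physical simulations. Wherever possible, topics come with hands-on code examples in the form of Jupyter notebooks, so that readers can get started quickly. Beyond standard supervised learning from data, we look at physical loss constraints, more tightly coupled learning algorithms with differentiable simulations, and reinforcement learning and uncertainty modeling. We live in exciting times: these methods have huge potential to fundamentally change what computer simulations can achieve.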

Translated by Google Translate

X-MAS: Extremely Large-Scale Multi-Modal Sensor Dataset for Outdoor Surveillance in Real Environments

DongKi Noh, Changki Sung, Teayoung Uhm, WooJu Lee, Hyungtae Lim, Jaeseok Choi, Kyuewang Lee, Dasol Hong, Daeho Um, Inseop Chung

Category: Robotics

2022-12-30

The robotics and computer vision communities have studied camera-based surveillance tasks extensively, including human detection, tracking, and motion recognition. Deep learning algorithms are widely used for these tasks, as they are in other computer vision tasks. However, existing public datasets are insufficient for developing learning-based methods that handle surveillance in outdoor and extreme situations such as harsh weather and low illuminance. We therefore introduce a new large-scale outdoor surveillance dataset named eXtremely large-scale Multi-modAl Sensor dataset (X-MAS), containing more than 500,000 image pairs and first-person view data annotated by well-trained annotators. Each pair contains multi-modal data (e.g. an IR image, an RGB image, a thermal image, a depth image, and a LiDAR scan). To the best of our knowledge, this is the first large-scale first-person view outdoor multi-modal dataset focusing on surveillance tasks. We present an overview of the proposed dataset with statistics, along with methods of exploiting the dataset with deep learning-based algorithms. The latest information on the dataset and our study is available at https://github.com/lge-robot-navi, and the dataset will be available for download through a server.


Weakly-Supervised Stitching Network for Real-World Panoramic Image Generation

Dae-Young Song, Geonsoo Lee, HeeKyung Lee, Gi-Mun Um, Donghyeon Cho

Category: Computer Vision

2022-09-13

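Recently, end-to-end deep learning-based stitching models have attracted growing attention. However, the most challenging aspect of deep learning-based stitching is obtaining pairs of input images with a narrow field of view and ground truth images with a wide field of view, both captured from real-world scenes. To overcome this difficulty, we develop a weakly-supervised learning mechanism that trains the stitching model without requiring genuine ground truth images. In addition, we propose a stitching model that takes multiple real-world fisheye images as input and creates a 360-degree output image in equirectangular projection format. In particular, our model consists of color consistency correction, warping, and blending, and is trained with perceptual and SSIM losses. The effectiveness of the proposed algorithm is verified on two real-world stitching datasets.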


"Es geht um Respekt, nicht um Technologie": Erkenntnisse aus einem Interessensgruppen-übergreifenden Workshop zu genderfairer Sprache und Sprachtechnologie

Sabrina Burtscher, Katta Spiel, Lukas Daniel Klausner, Manuel Lardelli, Dagmar Gromann

Category: Natural Language Processing

2022-09-06

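As non-binary people receive increasing attention in Western societies, strategies for gender-fair language have started to move away from binary (only female/male) concepts of gender. Nevertheless, hardly any approaches exist so far for taking these identities into account in machine translation models. A lack of understanding of the socio-technical implications of such technologies risks further reproducing linguistic mechanisms of oppression and mislabelling. In this paper, we describe the methods and results of a workshop on gender-fair language and language technologies, which was led and organised by ten researchers from TU Wien, St. Pölten UAS, FH Campus Wien and the University of Vienna and took place in Vienna in autumn 2021. A wide range of interest groups and their representatives were invited to ensure that the topic could be dealt with holistically. Accordingly, we aimed to include translators, machine translation experts and non-binary individuals (as "community experts") on an equal footing. Our analysis shows that gender in machine translation requires a high degree of context sensitivity, that developers of such technologies need to position themselves cautiously in a process that is still under social negotiation, and that flexible approaches seem most adequate at present. We then illustrate the steps that follow from our results for the field of gender-fair language technologies, so that technological developments can adequately keep pace with social advancements. [German abstract added manually by arXiv admins]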


MultiPathGAN: Structure Preserving Stain Normalization using Unsupervised Multi-domain Adversarial Network with Perception Loss

Haseeb Nazki, Ognjen Arandjelović, InHwa Um, David Harrison

Category: Computer Vision

2022-04-20

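Histopathology relies on the analysis of microscopic tissue images to diagnose disease. A crucial part of tissue preparation is staining, whereby a dye is used to make the salient tissue components more distinguishable. However, differences in laboratory protocols and scanning devices result in significant confounding appearance variation in the corresponding images. This variation increases both human error and inter-rater variability, and hinders the performance of automatic or semi-automatic methods. In this paper we introduce an unsupervised adversarial network to translate (and hence normalize) whole slide images across multiple data acquisition domains. Our key contributions are: (i) an adversarial architecture that learns across multiple domains with a single generator-discriminator network, using an information flow branch that optimizes for perceptual loss, and (ii) the inclusion of an additional feature extraction network during training, which guides the transformation network to keep all the structural features in the tissue image intact. We (i) first demonstrate the effectiveness of the proposed method on H&E slides of 120 cases of kidney cancer, and (ii) show the benefits of the approach on more general problems, such as flexible illumination-based natural image enhancement and light source adaptation.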


Anti-Spoofing Using Transfer Learning with Variational Information Bottleneck

Youngsik Eom, Yeonghyeon Lee, Ji Sub Um, Hoirin Kim

Category: Machine Learning

2022-04-04

Recent advances in sophisticated synthetic speech generated by text-to-speech (TTS) or voice conversion (VC) systems threaten existing automatic speaker verification (ASV) systems. Since such synthetic speech is generated by diverse algorithms, the ability to generalize from limited training data is indispensable for a robust anti-spoofing system. In this work, we propose a transfer learning scheme based on the wav2vec 2.0 pretrained model with a variational information bottleneck (VIB) for the speech anti-spoofing task. Evaluation on the ASVspoof 2019 logical access (LA) database shows that our method improves performance in distinguishing unseen spoofed speech from genuine speech, outperforming current state-of-the-art anti-spoofing systems. Furthermore, we show that the proposed system significantly improves performance in low-resource and cross-dataset settings of the anti-spoofing task, demonstrating that it is also robust to variations in data size and data distribution.


ALT: um software para análise de legibilidade de textos em Língua Portuguesa

Gleice Carvalho de Lima Moreno, Marco P. M. de Souza, Nelson Hein, Adriana Kroenke Hein

Category: Natural Language Processing

2022-03-23

In the initial stage of human life, communication, seen as a process of social interaction, was always the best way to reach consensus between the parties. Understanding and credibility in this process are essential for the mutual agreement to be validated. But how can this be done so that the communication reaches the general public? This is the main challenge when what is sought is the dissemination of information and its approval. In this context, this study presents the ALT software, developed from original readability metrics adapted to the Portuguese language and available on the web, to reduce communication difficulties. The development of the software was motivated by Habermas's theory of communicative action, which uses a multidisciplinary style to measure the credibility of discourse in the communication channels used to build and maintain a safe and healthy relationship with the public.
